Motif Recognition

Authors

  • Sushain Pandit
  • Susan R. VanderPlas
  • Chaoliang Zhang
Abstract

The problem of recognizing motifs in biological data has been well studied, and numerous algorithms, both exact and approximate, have been proposed to address it. We strongly believe that openly available, easily accessible, high-quality implementations of such algorithms are critical to the research community, so that researchers can directly reproduce and build on the results of other studies rather than reinvent the wheel. It is also important for an implementation to be as generic as possible, so that any researcher can extend it with minimal effort to test a new algorithmic extension or heuristic. With this motivation, we focus on an existing algorithm, PatternBranching, and, to a lesser degree, Yang2004. We analyze these approaches for minor heuristic changes and speed-ups obtained by adjusting certain thresholds, and implement the variants in high-level languages (Java and C) using well-thought-out programming practices and generic, extensible interfaces. We also analyze the performance of PatternBranching on a synthetically generated test suite covering a variety of sequence lengths and report the results. Code from this project will be made freely available online to the research community.

Similar resources

A MODEL FOR THE BASIC HELIX-LOOP-HELIX MOTIF AND ITS SEQUENCE SPECIFIC RECOGNITION OF DNA

A three dimensional model of the basic Helix-Loop-Helix motif and its sequence specific recognition of DNA is described. The basic-helix I is modeled as a continuous α-helix because no α-helix breaking residue is found between the basic region and the first helix. When the basic region of the two peptide monomers are aligned in the successive major groove of the cognate DNA, the hydrophobi...

Assessing Local Structure Motifs Using Order Parameters for Motif Recognition, Interstitial Identification, and Diffusion Path Characterization

Citation: Zimmermann NER, Horton MK, Jain A and Haranczyk M (2017) Assessing Local Structure Motifs Using Order Parameters for Motif Recognition, Interstitial Identification, and Diffusion Path Characterization. Front. Mater. 4:34. doi: 10.3389/fmats.2017.00034 Assessing Local Structure Motifs Using Order Parameters for Motif Recognition, Interstitial Identification, and Diffusion Path Characte...

Graphical approach to weak motif recognition.

We address the weak motif recognition problem in DNA sequences, which extends the general motif recognition to more difficult cases, allowing more degenerations in motif instances. Several algorithms have earlier attempted to find weak motifs in DNA sequences but with limitations. In this paper, we propose a graph-based algorithm for weak motif detection, which uses dynamic programming approach...

Discriminant Analysis and Its Application in DNA Sequence Motif Recognition

Identification of functional motifs in a DNA sequence is fundamentally a statistical pattern recognition problem. Discriminant analysis is widely used for solving such problems. This paper will review two basic parametric methods: LDA (linear discriminant analysis) and QDA (quadratic discriminant analysis). Their usage in recognition of splice sites and exons in the human genome will be demonst...

Negative in vitro selection identifies the rRNA recognition motif for ErmE methyltransferase.

Erm methyltransferases modify bacterial 23S ribosomal RNA at adenosine 2058 (A2058, Escherichia coli numbering) conferring resistance to macrolide, lincosamide, and streptogramin B (MLS) antibiotics. The motif that is recognized by Erm methyltransferases is contained within helix 73 of 23S rRNA and the adjacent single-stranded region around A2058. An RNA transcript of 72 nt that displays this m...

Sequence Analysis and Phylogenetic Profiling of the Nonstructural (NS) Genes of H9N2 Influenza A Viruses Isolated in Iran during 1998-2007

The earliest evidence of Avian Influenza (AI) virus circulation on Iranian poultry farms dates back to 1998. Great economic losses, through a dramatic drop in egg production and high mortality rates, are characteristically attributed to the H9N2 AI virus. In the present work, the non-structural (NS) genes of 10 Iranian H9N2 chicken AI viruses collected during 1998-2007 were fully sequenced and subjec...

Publication date: 2014